hMDAP: A Hybrid Framework for Multi-paradigm Data Analytical Processing on Spark

Authors

  • Xiaowang Zhang
  • Jiahui Zhang
  • Zhiyong Feng
Abstract

We propose hMDAP, a hybrid framework for large-scale data analytical processing on Spark that supports multi-paradigm processing (including OLAP, machine learning, and graph analysis) in distributed environments. The framework features a three-layer data processing module and a business process module that controls it. We will demonstrate the strengths of hMDAP on real-world traffic scenarios.


Similar resources

A Hybrid Fuzzy Multi-criteria Decision Making Model Based on Fuzzy DEMATEL with Fuzzy Analytical Network Process and Interpretative Structural Model for Prioritizing LARG Supply Chain Practices

In recent years, the LARG supply chain (SC) paradigm, a combination of four paradigms (lean, agile, resilient and green), has been increasingly employed. To capture the advantages of LARG in the SC, companies need to recognize proper practices and implement them with appropriate planning and infrastructure. However, one of its deficiencies is lack of proper method in the prior...


Evaluation of New Urbanism Principles: Hybrid AHP-TOPSIS Multi-Criteria Analysis Framework (Case Study: Neighborhoods of Historical Zone of Shiraz)

A large number of studies show that uncontrolled and unplanned spatial and functional developments in old central districts of large and medium-sized Iranian cities in past few decades have resulted in many unpleasant transformations. Thus, the historic urban fabrics have gradually lost their socio-economic livability and quality and this issue has led to urban distress, blight and deterioration. T...


Identifying the potential of Near Data Computing for Apache Spark

While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. There is also a renewed interest in Near Data Computing (NDC) due to technological advancement in the last decade. However, it is not known if ...


Cloudflow - enabling faster biomedical pipelines with MapReduce and Spark

For many years, Apache Hadoop has been used as a synonym for processing data in the MapReduce fashion. However, due to the complexity of developing MapReduce applications, adoption of this paradigm in genetics has been limited. To alleviate some of the issues, we have previously developed Cloudflow, a high-level pipeline framework that allows users to create sophisticated biomedical pipelines usi...


Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study

While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We ...



Journal title:
  • CoRR

Volume: abs/1701.04182

Publication date: 2017